Abstract

We present an exact algorithm that decides, for every fixed $r \ge 2$, in time
$O(m) + 2^{O(k^2)}$ whether a given multiset of $m$ clauses of size $r$ admits a truth
assignment that satisfies at least $((2^r - 1)m + k)/2^r$ clauses. Thus MAX-r-SAT
is fixed-parameter tractable when parameterized by the number of satisfied
clauses above the tight lower bound $(1 - 2^{-r})m$. This solves an open problem
of Mahajan, Raman and Sikdar (J. Comput. System Sci., 75, 2009).

Our algorithm is based on a polynomial-time data reduction procedure that
reduces a problem instance to an equivalent algebraically represented problem
with $O(k^2)$ variables. This is done by representing the instance as an appropriate
polynomial, and by applying a probabilistic argument combined with some
simple tools from harmonic analysis to show that if the polynomial cannot be
reduced to one of size $O(k^2)$, then there is a truth assignment satisfying the
required number of clauses.

We introduce a new notion of bikernelization from a parameterized problem 
to another one and apply it to prove that the above-mentioned parameterized 
MAX-r-SAT admits a polynomial-size kernel. 

Combining another probabilistic argument with tools from graph matching 
theory and signed graphs, we show that if an instance of Max-2-Sat with m 
clauses has at least 3k variables after application of certain polynomial time 
reduction rules to it, then there is a truth assignment that satisfies at least 
(3m + k)/4 clauses. 

We also outline how the fixed-parameter tractability and polynomial-size 
kernel results on MAX-r-SAT can be extended to more general families of Boolean 
Constraint Satisfaction Problems. 

*A preliminary version of this paper is to appear in the proceedings of ACM-SIAM Symposium 
on Discrete Algorithms (SODA 2010). 



1 Introduction 



The Maximum r-Satisfiability Problem (MAX-r-SAT) is a classic optimization problem
with a wide range of real-world applications. The task is to find a truth assignment
to a multiset of clauses, each with exactly $r$ literals, that satisfies as many
clauses as possible, or in the decision version of the problem, to satisfy at least $t$
clauses where $t$ is given with the input. Even Max-2-Sat is NP-hard [10] and hard
to approximate [15], in strong contrast with 2-Sat, which is solvable in linear time [3].

It is always possible to satisfy a $1 - 2^{-r}$ fraction of a given multiset of clauses with
exactly $r$ literals each; a truth assignment that meets this lower bound can be found
in polynomial time by Johnson's algorithm [19]. This lower bound is tight in the sense
that it is optimal for an infinite sequence of instances. In this paper we show that
for every fixed $r$ we can decide in time $O(m) + 2^{O(k^2)}$ whether a given multiset of $m$
clauses admits a truth assignment that satisfies at least $((2^r - 1)m + k)/2^r$ clauses.
Thus, MAX-r-SAT is fixed-parameter tractable when parameterized by the number
of satisfied clauses above the tight lower bound; this answers a question posed by
Mahajan, Raman and Sikdar [21].

Our algorithm, described in Section 4, is based on a polynomial-time data reduction
procedure that reduces a problem instance to an equivalent algebraically represented
problem with $O(k^2)$ variables. This is done by representing the instance as an
appropriate polynomial, and by applying a probabilistic argument combined with some
simple tools from harmonic analysis to show that if the polynomial cannot be
reduced to one of size $O(k^2)$, then there is a truth assignment satisfying the required
number of clauses. The basic approach is based on the ideas of [1], and a similar one
which, however, does not apply any algebraic reductions, was used in [11, 12] to show
the existence of quadratic kernels for other problems parameterized above tight lower
bounds.

We also show that the above-mentioned parameterized MAX-r-SAT admits a
polynomial-size kernel. This can be deduced from our fixed-parameter result and
a general lemma proved in Section 3, and can also be proved by a more efficient,
direct argument. The lemma, which is interesting in its own right, links a new
concept that we call bikernelization with the well-known concept of kernelization. We
believe that bikernelization, in general, and the lemma, in particular, will have further
applications.

In Section 5, combining another probabilistic argument with tools from graph
matching theory and signed graphs, we show that if an instance $I$ of Max-2-Sat
on $m$ clauses has at least $3k$ variables after application of certain polynomial time
reduction rules to it, then there is a truth assignment for $I$ that satisfies at least
$(3m + k)/4$ clauses. Thus, Max-2-Sat admits a problem kernel with at most $3k - 1$
variables.

Section 6 is devoted to discussions. In particular, we outline how the
fixed-parameter tractability and polynomial-size kernel results on MAX-r-SAT can be
extended to more general families of Boolean Constraint Satisfaction Problems.

Related Work. Parameterizations above a guaranteed value were first considered by
Mahajan and Raman [20] for the problems Max-Sat and Max-Cut. They devised
an algorithm for Max-Sat with running time $O^*(1.618^k + \sum_{i=1}^{m} |C_i|)$ that finds, for a
multiset $\{C_1, \ldots, C_m\}$ of $m$ clauses, a truth assignment satisfying at least $\lceil m/2 \rceil + k$
clauses, or decides that no such truth assignment exists ($|C_i|$ denotes the number of
literals in $C_i$). In a recent paper [21] Mahajan, Raman and Sikdar argued that a
natural (and challenging) parameter for a maximization problem is the number of
clauses satisfied above a tight lower bound, which is $(1 - 2^{-r})m$ for Max-Sat if each
clause contains exactly r different variables. Only a few non-trivial results are known 
for problems parameterized above a tight lower bound [HI [T21 [TH [T71 [20] . 

Mahajan et al. [21] state several problems parameterized above a tight lower bound
whose parameterized complexity is open. One of the problems is the (exact)
MAX-r-SAT problem (an instance consists of $m$ clauses, each containing exactly $r$ different
literals) parameterized by the number of satisfied clauses above the tight lower bound
$(1 - 2^{-r})m$. Our main result answers this question.

2 Preliminaries 

We assume an infinite supply of propositional variables. A literal is a variable $x$ or
its negation $\bar{x}$. A clause is a finite set of literals not containing a complementary
pair $x$ and $\bar{x}$. A clause is of size $r$ if it contains exactly $r$ literals. For simplicity of
presentation, we will denote a clause by a sequence of its literals. For example, the
clause $\{x, y\}$ will be denoted $xy$ or equivalently $yx$. A CNF formula $F$ is a finite
multiset of clauses (a clause may appear in the multiset several times). A variable
$x$ occurs in a clause if the clause contains $x$ or $\bar{x}$, and $x$ occurs in a CNF formula $F$
if it occurs in some clause of $F$. Let var(C) and var(F) denote the sets of variables
occurring in $C$ and $F$, respectively. A CNF formula is an r-CNF formula if $|C| = r$
for all $C \in F$. Thus we require that each clause of an r-CNF formula contains exactly $r$
different literals (some authors use the term exact r-CNF for this). A truth assignment
is a mapping $\tau : V \to \{-1, 1\}$ defined on some set $V$ of variables. We write $2^V$ to
denote the set of all truth assignments on $V$. A truth assignment $\tau$ satisfies a clause
$C$ if there is some variable $x \in C$ with $\tau(x) = 1$ or a negated variable $\bar{x} \in C$ with
$\tau(x) = -1$. We write $\mathrm{sat}(\tau, F)$ for the number of clauses of $F$ that are satisfied by $\tau$,
and we write

$$\mathrm{sat}(F) = \max_{\tau \in 2^{\mathrm{var}(F)}} \mathrm{sat}(\tau, F).$$
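
As an illustration of these definitions, the following minimal Python sketch computes sat(τ, F) and sat(F) by brute force. The encoding (a literal is a nonzero integer, with `-v` standing for the negation of variable `v`) and the function names are assumptions made for this example and are not notation from the paper.

```python
from itertools import product

Clause = tuple[int, ...]  # literal v > 0 is variable v; literal -v is its negation


def variables(formula: list[Clause]) -> list[int]:
    """var(F): sorted list of the variables occurring in the formula."""
    return sorted({abs(lit) for clause in formula for lit in clause})


def sat_count(tau: dict[int, int], formula: list[Clause]) -> int:
    """sat(tau, F): number of clauses satisfied by tau (values are -1 or 1)."""
    return sum(
        any(tau[abs(lit)] == (1 if lit > 0 else -1) for lit in clause)
        for clause in formula
    )


def max_sat(formula: list[Clause]) -> int:
    """sat(F): maximum of sat(tau, F) over all truth assignments on var(F)."""
    vs = variables(formula)
    return max(
        sat_count(dict(zip(vs, values)), formula)
        for values in product((-1, 1), repeat=len(vs))
    )


# All four clauses on two variables: every assignment falsifies exactly one.
F = [(1, 2), (1, -2), (-1, 2), (-1, -2)]
print(max_sat(F))  # 3
```

The enumeration takes time exponential in the number of variables, so it is only meant for checking small examples.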

A parameterized problem is a subset $L \subseteq \Sigma^* \times \mathbb{N}$ over a finite alphabet $\Sigma$. $L$ is
fixed-parameter tractable if the membership of an instance $(x, k)$ in $\Sigma^* \times \mathbb{N}$ can be decided in
time $|x|^{O(1)} \cdot f(k)$ where $f$ is a computable function of the parameter [8, 9, 22]. Given
a parameterized problem $L$, a kernelization of $L$ is a polynomial-time algorithm that
maps an instance $(x, k)$ to an instance $(x', k')$ (the kernel) such that (i) $(x, k) \in L$
if and only if $(x', k') \in L$, (ii) $k' \le f(k)$, and (iii) $|x'| \le g(k)$ for some functions $f$
and $g$. The function $g(k)$ is called the size of the kernel. A parameterized problem is
fixed-parameter tractable if and only if it is decidable and admits a kernelization [9].

We shall consider the following parameterized version of MAX-r-SAT.

MAX-r-SAT ABOVE TIGHT LOWER BOUND (or MAX-r-SAT TLB for short)

Instance: A pair $(F, k)$ where $F$ is a multiset of $m$ clauses of size $r$ and
$k$ is a nonnegative integer.

Parameter: The integer $k$.

Question: Is $\mathrm{sat}(F) \ge ((2^r - 1)m + k)/2^r$?

We note that Mahajan et al. [21] use a slightly different formulation of the problem,
asking for an assignment that satisfies at least $(1 - 2^{-r})m + k$ clauses; since $r$ is fixed,
this change does not affect the complexity of the problem.

We will also refer to the following special case of another problem introduced by
Mahajan et al. [21].



MAX-r-LIN2 ABOVE TIGHT LOWER BOUND (or MAX-r-LIN2 TLB for short)

Instance: A system of $m$ linear equations $e_1, \ldots, e_m$ in $n$ variables over
$\mathbb{F}_2$, where no equation has more than $r$ variables, and each equation $e_j$
has a positive integral weight $w_j$, and a nonnegative integer $k$.

Parameter: The integer $k$.

Question: Is there an assignment of values to the $n$ variables such that
the total weight of the satisfied equations is at least $(W + k)/2$, where
$W = w_1 + \cdots + w_m$?



Note that trivially $W/2$ is indeed a tight lower bound for the above problem, as
the expected weight of satisfied equations in a random assignment is $W/2$, and if
the equations come in identical pairs with contradicting free terms, no assignment
satisfies more than weight $W/2$. It was proved in [12] that MAX-r-LIN2 TLB admits a kernel
with $O(k^2)$ equations and variables.
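
The tightness remark can be checked on a toy system. The sketch below is only an illustration: the triple encoding of an equation (variables, right-hand side, weight) and the helper names are assumptions of this example. It evaluates the satisfied weight over all 0/1 assignments for a pair of contradicting equations.

```python
from itertools import product

# An equation is (variables, rhs, weight): the sum of the listed z_i over F_2 equals rhs.
Equation = tuple[tuple[int, ...], int, int]


def satisfied_weight(z: dict[int, int], system: list[Equation]) -> int:
    """Total weight of the equations satisfied by the 0/1 assignment z."""
    return sum(w for vars_, rhs, w in system if sum(z[v] for v in vars_) % 2 == rhs)


def weights_over_all_assignments(system: list[Equation]) -> list[int]:
    """Satisfied weight for every 0/1 assignment to the variables of the system."""
    vs = sorted({v for vars_, _, _ in system for v in vars_})
    return [
        satisfied_weight(dict(zip(vs, bits)), system)
        for bits in product((0, 1), repeat=len(vs))
    ]


# Two copies of z1 + z2 with contradicting free terms, each of weight 3.
system = [((1, 2), 0, 3), ((1, 2), 1, 3)]
W = sum(w for _, _, w in system)
results = weights_over_all_assignments(system)
# Every assignment satisfies exactly one equation of the pair, i.e. weight W/2.
```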

3 Bikernelization 

In this section we introduce a new notion of a bikernelization and study its basic
properties. A bikernelization from $L$ to $L'$ is of interest especially when $L'$ is a
well-studied problem.

Given a pair $L, L'$ of parameterized problems, a bikernelization from $L$ to $L'$ is a
polynomial-time algorithm that maps an instance $(x, k)$ to an instance $(x', k')$ (the
bikernel) such that (i) $(x, k) \in L$ if and only if $(x', k') \in L'$, (ii) $k' \le f(k)$, and
(iii) $|x'| \le g(k)$ for some functions $f$ and $g$. The function $g(k)$ is called the size of
the bikernel. Observe that a kernelization of a parameterized problem $L$ is simply a
bikernelization from $L$ to itself, i.e., a bikernelization generalizes a kernelization.

Recall that a parameterized problem is fixed-parameter tractable if and only if it
is decidable and admits a kernelization. This result can be extended as follows: A
parameterized problem $L$ is fixed-parameter tractable if and only if it is decidable and
admits a bikernelization from itself to a parameterized problem $L'$. Indeed, if $L$ is
fixed-parameter tractable, then $L$ is decidable and admits a bikernelization to itself.
If $L$ is decidable and admits a bikernelization from itself to a parameterized problem
$L'$, then $(x, k)$ can be decided by first mapping it to $(x', k')$ in polynomial time and
then deciding $(x', k')$ in time depending only on $k'$, and thus only on $k$.

We are especially interested in cases when kernels are of polynomial size. The 
next lemma is similar to Theorem 3 in [5]. 

Lemma 1. Let $L, L'$ be a pair of parameterized problems such that $L'$ is in NP and
$L$ is NP-complete. If there is a bikernelization from $L$ to $L'$ producing a bikernel of
polynomial size, then $L$ has a polynomial-size kernel.

Proof. Consider a bikernelization from $L$ to $L'$ that maps an instance $(x, k) \in L$ to an
instance $(x', k') \in L'$ with $k' \le f(k)$. Since $L'$ is in NP and $L$ is NP-complete, there
exists a polynomial time reduction from $L'$ to $L$. Thus, we can find in polynomial
time an instance $(x'', k'')$ of $L$ which is decision-equivalent with $(x', k')$, and in turn
with $(x, k)$. Observe that $|x''| \le |x'|^{O(1)} \le k^{O(1)}$ and $k'' \le (k')^{O(1)} + (|x'|)^{O(1)} \le
(f(k))^{O(1)} + k^{O(1)}$. Thus, $(x'', k'')$ is a kernel of $L$ of polynomial size. □

4 MAX-r-SAT 

4.1 An Algebraic Representation 

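Let $F$ be an r-CNF formula with clauses $C_1, \ldots, C_m$ in the variables $x_1, x_2, \ldots, x_n$.
For $F$, consider

$$X = \sum_{C \in F} \Big[ 1 - \prod_{x_i \in \mathrm{var}(C)} (1 + \varepsilon_i x_i) \Big],$$

where $\varepsilon_i \in \{-1, 1\}$ and $\varepsilon_i = -1$ if and only if $x_i$ is in $C$.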

Lemma 2. For a truth assignment $\tau$, we have $X = 2^r(\mathrm{sat}(\tau, F) - (1 - 2^{-r})m)$.

Proof. Observe that $\prod_{x_i \in \mathrm{var}(C)} (1 + \varepsilon_i x_i)$ equals $2^r$ if $C$ is falsified and 0, otherwise.
Thus, $X = m - 2^r(m - \mathrm{sat}(\tau, F))$, implying the claimed formula. □

After algebraic simplification $X = X(x_1, x_2, \ldots, x_n)$ can be written as
$X = \sum_{I \in \mathcal{S}} X_I$, where $X_I = c_I \prod_{i \in I} x_i$, each $c_I$ is a nonzero integer and $\mathcal{S}$ is a family
of nonempty subsets of $\{1, \ldots, n\}$ each with at most $r$ elements.
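
The simplification can be sketched in Python as follows. This is an illustrative implementation under assumed conventions (integer literals with `-v` for a negated variable, a polynomial stored as a dictionary from index sets to coefficients); the function names are hypothetical. It expands each product, collects the coefficients $c_I$, and checks the identity of Lemma 2 on a small 3-CNF formula.

```python
from collections import Counter
from itertools import combinations, product
from math import prod

Clause = tuple[int, ...]  # literal v > 0 is variable v; literal -v is its negation


def simplify(formula: list[Clause]) -> dict[frozenset[int], int]:
    """Expand X = sum_C [1 - prod(1 + eps_i x_i)] and keep the nonzero c_I."""
    coeff: Counter = Counter()
    for clause in formula:
        # eps_i = -1 iff x_i occurs unnegated in the clause.
        eps = {abs(lit): (-1 if lit > 0 else 1) for lit in clause}
        for size in range(1, len(eps) + 1):
            for subset in combinations(sorted(eps), size):
                # The product contributes prod(eps_i) * x_I; X subtracts it.
                coeff[frozenset(subset)] -= prod(eps[i] for i in subset)
    return {I: c for I, c in coeff.items() if c != 0}


def evaluate(poly: dict[frozenset[int], int], tau: dict[int, int]) -> int:
    """Value of the simplified polynomial at a -1/1 assignment."""
    return sum(c * prod(tau[i] for i in I) for I, c in poly.items())


def sat_count(tau: dict[int, int], formula: list[Clause]) -> int:
    return sum(
        any(tau[abs(lit)] == (1 if lit > 0 else -1) for lit in clause)
        for clause in formula
    )


def check_lemma2(formula: list[Clause], r: int) -> bool:
    """Check X = 2^r * sat(tau, F) - (2^r - 1) * m for every assignment tau."""
    poly = simplify(formula)
    vs = sorted({abs(lit) for clause in formula for lit in clause})
    m = len(formula)
    return all(
        evaluate(poly, tau) == 2**r * sat_count(tau, formula) - (2**r - 1) * m
        for tau in (dict(zip(vs, v)) for v in product((-1, 1), repeat=len(vs)))
    )


F3 = [(1, 2, 3), (1, -2, 3), (-1, 2, -3), (1, 2, 3)]
print(check_lemma2(F3, 3))
```

For the two clauses `(1, 2)` and `(1, -2)` all terms cancel except $2x_1$, which shows how the number of surviving terms can be much smaller than the number of clauses.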

The question we address is that of deciding whether or not there are values $x_i \in
\{-1, 1\}$ so that $X = X(x_1, x_2, \ldots, x_n) \ge k$. The idea is to use a probabilistic argument
and show that if the above polynomial has many nonzero coefficients, that is, if $|\mathcal{S}|$
is large, this is necessarily the case, whereas if it is small, the question can be solved
by checking all possibilities of the relevant variables.

4.2 The Properties of X 

In what follows, we assume that each variable $x_i$ takes its values randomly and
independently in $\{-1, 1\}$ and thus $X$ is a random variable. Our approach is similar to
the one in [1]. For completeness, we reproduce part of the argument (modifying it a
bit and slightly improving the constant for the case considered here). We need the
following simple lemma.

Lemma 3 (see, e.g., [1], Lemma 3.1). For every real random variable $X$ with finite
and positive fourth moment,

$$E(|X|) \ge \frac{E(X^2)^{3/2}}{E(X^4)^{1/2}}.$$



The above lemma implies the following (see [1], Lemma 3.2, part (ii) for a similar
result).

Corollary 1. Let $X$ be a real random variable and suppose that its first, second and
fourth moments satisfy $E(X) = 0$, $E(X^2) = \sigma^2 > 0$ and $E(X^4) \le b\sigma^4$. Then

$$P\Big(X > \frac{\sigma}{2\sqrt{b}}\Big) > 0.$$

Proof. By Lemma 3, $E(|X|) \ge \sigma/\sqrt{b}$. Since $E(X) = 0$ it follows that

$$P(X > 0)\,E(X \mid X > 0) \ge \frac{\sigma}{2\sqrt{b}}. \qquad (1)$$

Therefore, $X$ must be at least $\sigma/(2\sqrt{b})$ with positive probability. □
We also use the following lemma of Bourgain [7] . 

Lemma 4 ([7]). Let $f = f(x_1, \ldots, x_n)$ be a polynomial of degree $r$ in $n$ variables
$x_1, \ldots, x_n$ with domain $\{-1, 1\}$. Define a random variable $X$ by choosing a vector
$(\varepsilon_1, \ldots, \varepsilon_n) \in \{-1, 1\}^n$ uniformly at random and setting $X = f(\varepsilon_1, \ldots, \varepsilon_n)$. Then,
$E(X^4) \le 2^{6r}(E(X^2))^2$.

Returning to the random variable $X = X(x_1, x_2, \ldots, x_n)$ defined in the previous
subsection, we prove the following.

Lemma 5. Let $X = \sum_{I \in \mathcal{S}} X_I$, where $X_I = c_I \prod_{i \in I} x_i$ is as in the previous
subsection, and assume it is not identically zero. Then $E(X) = 0$, $E(X^2) = \sum_{I \in \mathcal{S}} c_I^2 \ge
|\mathcal{S}| > 0$ and $E(X^4) \le 2^{6r} E(X^2)^2$.

Proof. Since the $x_i$'s are mutually independent, $E(X) = 0$. Note that for $I, J \in \mathcal{S}$,
$I \ne J$, we have $E(X_I X_J) = c_I c_J E(\prod_{i \in I \Delta J} x_i) = 0$, where $I \Delta J$ is the symmetric
difference of $I$ and $J$. Thus, $E(X^2) = \sum_{I \in \mathcal{S}} c_I^2$. By Lemma 4, $E(X^4) \le 2^{6r} E(X^2)^2$.

□
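
The moment identities of Lemma 5 can be verified numerically on a toy polynomial. In the sketch below the polynomial (a dictionary from index sets to integer coefficients) and the helper name `moment` are assumptions chosen for illustration; exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction
from itertools import product
from math import prod


def moment(poly: dict[frozenset[int], int], p: int) -> Fraction:
    """E(X^p) for X = sum_I c_I * prod_{i in I} x_i with uniform x_i in {-1, 1}."""
    vs = sorted(set().union(*poly))
    total = 0
    for values in product((-1, 1), repeat=len(vs)):
        tau = dict(zip(vs, values))
        total += sum(c * prod(tau[i] for i in I) for I, c in poly.items()) ** p
    return Fraction(total, 2 ** len(vs))


# A degree-2 example: X = 2*x1 - x1*x2 + 3*x2*x3, so r = 2.
poly = {frozenset({1}): 2, frozenset({1, 2}): -1, frozenset({2, 3}): 3}
r = 2
second = moment(poly, 2)
print(moment(poly, 1), second)                           # E(X) and E(X^2)
print(second == sum(c * c for c in poly.values()))       # E(X^2) = sum of c_I^2
print(moment(poly, 4) <= 2 ** (6 * r) * second ** 2)     # bound of Lemma 4
```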

4.3 The Main Result for General r 

Theorem 1. The problem MAX-r-SAT TLB is fixed-parameter tractable and can be
solved in time $O(m) + 2^{O(k^2)}$. Moreover, there exist (i) a polynomial-size bikernel
from MAX-r-SAT TLB to MAX-r-LIN2 TLB, and (ii) a polynomial-size kernel of
MAX-r-SAT TLB. In fact, there are such a bikernel and a kernel of size $O(k^2)$.

Proof. By Lemma 2 our problem is equivalent to that of deciding whether or not
there is a truth assignment to the variables $x_1, x_2, \ldots, x_n$, so that

$$X(x_1, \ldots, x_n) \ge k. \qquad (2)$$

Note that in particular this implies that if $X$ is the zero polynomial, then any truth
assignment satisfies exactly a $(1 - 2^{-r})$ fraction of the original clauses. By Corollary 1
and Lemma 5, $P(X > \sigma/(2\sqrt{b})) > 0$, where $b = 2^{6r}$ and $\sigma^2 = E(X^2) = \sum_{I \in \mathcal{S}} c_I^2 \ge |\mathcal{S}|$; the
last inequality follows from the fact that each $|c_I|$ is a positive integer. Therefore
$P(X > \sqrt{|\mathcal{S}|}/(2 \cdot 8^r)) > 0$. Now, if $k \le \sqrt{|\mathcal{S}|}/(2 \cdot 8^r)$, then there are $x_i \in \{-1, 1\}$ such that (2) holds,
and there is an assignment for which the answer to MAX-r-SAT TLB is Yes. Otherwise,
$|\mathcal{S}| = O(k^2)$, and in fact even $\sum_{I \in \mathcal{S}} |c_I| \le \sum_{I \in \mathcal{S}} c_I^2 = O(k^2)$, that is, the total number
of terms of the simplified polynomial, even when counted with multiplicities, is at most
$O(k^2)$.

For any fixed $r$, the representation of a problem instance of $m$ clauses as a
polynomial, and the simplification of this polynomial, can be performed in time $O(m)$.
If the number of nonzero terms of this polynomial is larger than $4 \cdot 8^{2r} k^2$, then the
answer to the problem is Yes. Otherwise, the polynomial has at most $O(k^2)$ terms
and depends on at most $O(k^2)$ variables, and its maximum can be found in time
$2^{O(k^2)}$.
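
The two-case procedure just described can be sketched as follows. This is a minimal illustration, not an optimized implementation: the clause encoding, the dictionary representation of the polynomial and the names `simplify` and `solve_tlb` are assumptions of this example, and the simplification step below does not attempt to achieve the $O(m)$ bound.

```python
from collections import Counter
from itertools import combinations, product
from math import prod

Clause = tuple[int, ...]  # literal v > 0 is variable v; literal -v is its negation


def simplify(formula: list[Clause]) -> dict[frozenset[int], int]:
    """Nonzero coefficients c_I of X = sum_C [1 - prod(1 + eps_i x_i)]."""
    coeff: Counter = Counter()
    for clause in formula:
        eps = {abs(lit): (-1 if lit > 0 else 1) for lit in clause}
        for size in range(1, len(eps) + 1):
            for subset in combinations(sorted(eps), size):
                coeff[frozenset(subset)] -= prod(eps[i] for i in subset)
    return {I: c for I, c in coeff.items() if c != 0}


def solve_tlb(formula: list[Clause], k: int, r: int) -> bool:
    """Decide whether sat(F) >= ((2^r - 1) m + k) / 2^r, i.e. whether max X >= k."""
    poly = simplify(formula)
    # Case 1: many nonzero terms, so some assignment reaches X >= k.
    if len(poly) > 4 * 8 ** (2 * r) * k * k:
        return True
    # Case 2: few terms, so only O(k^2) variables remain; try all assignments.
    vs = sorted(set().union(*poly))
    best = max(
        sum(c * prod(tau[i] for i in I) for I, c in poly.items())
        for tau in (dict(zip(vs, v)) for v in product((-1, 1), repeat=len(vs)))
    )
    return best >= k


complete = [(1, 2), (1, -2), (-1, 2), (-1, -2)]  # X is the zero polynomial
twice = [(1, 2), (1, 2)]                          # X = 2*x1 + 2*x2 - 2*x1*x2
print(solve_tlb(complete, 1, 2), solve_tlb(twice, 2, 2))
```

For the small inputs above the threshold $4 \cdot 8^{2r} k^2$ is never exceeded, so the answers come from the exhaustive branch.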

This completes the proof of the first part of the theorem. We next establish the
second part. Given the simplified polynomial $X$ as above, define a problem in
MAX-r-LIN2 TLB with the variables $z_1, z_2, \ldots, z_n$ as follows. For each nonzero term $c_I \prod_{i \in I} x_i$
consider the linear equation $\sum_{i \in I} z_i = b$, where $b = 0$ if $c_I$ is positive, and $b = 1$ if $c_I$
is negative, and either associate this equation with the weight $w_I = |c_I|$, or duplicate
it $|c_I|$ times. It is easy to check that this system of equations has an assignment $z_i$
satisfying at least $[\sum_{I \in \mathcal{S}} w_I + k]/2$ of the equations if and only if there are $x_i \in \{-1, 1\}$
so that $X(x_1, x_2, \ldots, x_n) \ge k$. This is shown by the transformation $x_i = (-1)^{z_i}$. See
also [TB] and [12] for a similar discussion. Since, as explained above, we may assume
that $\sum_{I \in \mathcal{S}} |c_I| = O(k^2)$ (as otherwise we know that the answer to our problem is
Yes), this provides the required bikernel of size $O(k^2)$ to MAX-r-LIN2 TLB.
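
The correspondence between the polynomial and the weighted system can be made concrete with a short sketch. The data layout (equations as triples of variables, right-hand side and weight) and the function names are assumptions of this illustration. Under $x_i = (-1)^{z_i}$ a term contributes $+|c_I|$ exactly when its equation is satisfied, so $X$ equals twice the satisfied weight minus the total weight $W$.

```python
from itertools import product
from math import prod

Equation = tuple[tuple[int, ...], int, int]  # (variables, rhs, weight)


def to_lin2(poly: dict[frozenset[int], int]) -> list[Equation]:
    """One equation sum_{i in I} z_i = b per term, with b = 0 iff c_I > 0, weight |c_I|."""
    return [(tuple(sorted(I)), 0 if c > 0 else 1, abs(c)) for I, c in poly.items()]


def satisfied_weight(z: dict[int, int], system: list[Equation]) -> int:
    return sum(w for vars_, rhs, w in system if sum(z[v] for v in vars_) % 2 == rhs)


def identity_holds(poly: dict[frozenset[int], int]) -> bool:
    """Check X(x) = 2 * satisfied_weight(z) - W for all z, where x_i = (-1)^(z_i)."""
    system = to_lin2(poly)
    W = sum(w for _, _, w in system)
    vs = sorted(set().union(*poly))
    for bits in product((0, 1), repeat=len(vs)):
        z = dict(zip(vs, bits))
        x = {v: (-1) ** z[v] for v in vs}
        X = sum(c * prod(x[i] for i in I) for I, c in poly.items())
        if X != 2 * satisfied_weight(z, system) - W:
            return False
    return True


poly = {frozenset({1}): 2, frozenset({1, 2}): -1, frozenset({2, 3}): 3}
print(to_lin2(poly))
print(identity_holds(poly))
```

Consequently $X \ge k$ holds for some assignment if and only if some assignment satisfies equations of total weight at least $(W + k)/2$.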

It remains to prove the existence of a polynomial size kernel for the original
problem. One way to do that is to apply Lemma 1. Indeed, MAX-r-LIN2 TLB is in NP, and
MAX-r-SAT TLB is NP-complete, implying the desired result.

It is also possible to give a direct proof, which shows that the problem admits a
kernel of size at most $O(k^2)$. To do so, we replace each linear equation of at most
$r$ variables by a set of $2^{r-1}$ clauses, so that if the variables $z_i$ satisfy the equation,
the same Boolean variables $x_i = z_i$ satisfy all these clauses, and if the variables $z_i$
do not satisfy the equation, then the variables $x_i$ above satisfy only $2^{r-1} - 1$ of the
clauses. This is done as follows. Consider, first, a linear equation with exactly $r$
variables. After renumbering the variables, if needed, a typical equation is of the
form $z_1 + z_2 + \cdots + z_r = b$, where the sum is over $\mathbb{F}_2$ and $b \in \{0, 1\}$. There are
exactly $2^{r-1}$ Boolean assignments $\delta = (\delta_1, \delta_2, \ldots, \delta_r)$ for the variables $z_i$ that do not
satisfy the equation. For each such assignment $\delta$ let $C_\delta$ be the clause consisting of
$r$ literals, where the literal number $i$ is $x_i$ if $\delta_i = 0$ and is $\bar{x}_i$ if $\delta_i = 1$. Note that if
the variables $z_1, z_2, \ldots, z_r$ satisfy the above equation, then $(z_1, z_2, \ldots, z_r)$ is not one
of the vectors $\delta$ considered, and hence each of the clauses $C_\delta$ constructed contains at
least one satisfied literal when $x_i = z_i$. Therefore, in this case all clauses are satisfied.
A similar argument shows that if the variables $z_i$ do not satisfy the equation, there
will be exactly one non-satisfied clause, namely the one corresponding to the vector
$\delta = (z_1, z_2, \ldots, z_r)$. The construction can be extended to equations with fewer than $r$
variables. Indeed, the only property used in the transformation above is that there are
exactly $2^{r-1}$ Boolean assignments for the variables $z_1, z_2, \ldots, z_r$ that do not satisfy
the equation. If the equation has only $s$ variables, $1 \le s < r$, add to these variables
an arbitrary set of $r - s$ of the other variables, and consider the set of all Boolean
assignments to this augmented set of variables that do not satisfy the equation. Here,
too, there are exactly $2^{r-1}$ such assignments and we can thus repeat the construction
above in this case as well.

The above procedure transforms a set of $W$ linear equations over $\mathbb{F}_2$ into a multiset
of $2^{r-1}W$ clauses. Moreover, if some truth assignment does not satisfy exactly $\ell$
equations, then the same assignment does not satisfy the same number, $\ell$, of clauses.
In particular, there is an assignment satisfying all equations but $(W - k)/2$ of them,
if and only if there is an assignment satisfying all clauses but $(W - k)/2$ of them.
This means that among the $m = 2^{r-1}W$ clauses, the number of satisfied ones is
$m - (W - k)/2 = [(2^r - 1)m + 2^{r-1}k]/2^r$. This reduces an instance of MAX-r-LIN2 TLB
with $W$ equations and parameter $k$ to an instance of MAX-r-SAT TLB with $2^{r-1}W$
clauses and parameter $2^{r-1}k$. Since $r$ is a constant, this provides the required kernel
of size $O(k^2)$, completing the proof. □
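
The clause gadget used in the direct proof can be sketched for a single equation. The encoding below is an assumption made for illustration (integer literals with `-v` for a negated variable, and a Boolean variable counted as true when $z_v = 1$); the function names are hypothetical. The padding step follows the remark about equations with fewer than $r$ variables.

```python
from itertools import product


def equation_to_clauses(vars_, rhs, r, all_vars):
    """Return the 2^(r-1) clauses C_delta for the equation sum(z_v, v in vars_) = rhs."""
    # Pad with other variables so that every clause has exactly r literals.
    padded = list(vars_) + [v for v in all_vars if v not in vars_][: r - len(vars_)]
    assert len(padded) == r, "not enough other variables to pad the equation"
    clauses = []
    for delta in product((0, 1), repeat=r):
        # Keep only the assignments delta that violate the equation.
        if sum(delta[: len(vars_)]) % 2 != rhs:
            # Literal i is x_i if delta_i = 0 and its negation if delta_i = 1,
            # so the clause is falsified exactly when z agrees with delta.
            clauses.append(tuple(v if d == 0 else -v for v, d in zip(padded, delta)))
    return clauses


def clause_satisfied(clause, z):
    """True if the 0/1 assignment z (1 meaning true) satisfies the clause."""
    return any(z[abs(lit)] == (1 if lit > 0 else 0) for lit in clause)


# The equation z1 + z2 = 1 padded to size r = 3 with variable 3.
all_vars = [1, 2, 3, 4]
clauses = equation_to_clauses((1, 2), 1, 3, all_vars)
print(clauses)
```

A satisfying assignment of the equation satisfies all four clauses, and any other assignment falsifies exactly one of them.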



5 Max-2-Sat 

In this section we describe an alternative, more combinatorial, approach to the
problem for $r = 2$. Although this approach is somewhat more complicated than the one
discussed in the previous section, it provides additional insight into this special case
of the problem, and allows us to obtain a linear kernel for Max-2-Sat TLB.
We start with a simple reduction rule that applies to any value of $r$.



5.1 The Semicomplete Reduction 

We say that a pair of distinct clauses $Y$ and $Z$ has a conflict if there is a literal $p \in Y$
such that $\bar{p} \in Z$. We say that an r-CNF formula $F$ is semicomplete if the number of
clauses is $m = 2^r$ and every pair of distinct clauses of $F$ has a conflict. A semicomplete
r-CNF formula is complete if each clause is over the same set of variables. There are
r-CNF formulas that are semicomplete but not complete; consider for example
$\{xy, x\bar{y}, \bar{x}z, \bar{x}\bar{z}\}$. We have the following:

Lemma 6. Every truth assignment to a semicomplete r-CNF formula satisfies exactly
$2^r - 1$ clauses.

Proof. Let $S$ be a semicomplete r-CNF formula on $n$ variables. To prove that no truth
assignment satisfies all clauses of $S$ we use the following simple counting argument from [18].
Observe that every clause is not satisfied by exactly $2^{n-r}$ truth assignments. However,
each of these assignments satisfies each other clause (due to the conflicts). So, we have
exactly $2^r \cdot 2^{n-r}$ truth assignments not satisfying $S$. But $2^r \cdot 2^{n-r} = 2^n$, the total
number of truth assignments.

Now let $\tau$ be a truth assignment of $S$. By the above, $\tau$ does not satisfy a clause
$C$ of $S$. However, $\tau$ satisfies any other clause of $S$ as any other clause has a conflict
with $C$. □

Consider the following data reduction procedure.

Given an r-CNF formula $F$ that contains a semicomplete subset $F' \subseteq F$, delete
$F'$ from $F$ and consider $F \setminus F'$ instead. Let $F^S$ denote the formula obtained from
$F$ by applying this deletion process as long as possible. We say that $F^S$ is obtained
from $F$ by semicomplete reduction.

We state the following two simple observations as a lemma. 

Lemma 7. Let $F$ be an r-CNF formula.

1. $F^S$ can be obtained from $F$ in polynomial time.

2. $\mathrm{sat}(F) - \mathrm{sat}(F^S) = (1 - 2^{-r})(|F| - |F^S|)$.
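
A naive version of the semicomplete reduction can be sketched as follows. This is an illustration only: it searches all subsets of $2^r$ clauses, which is polynomial for fixed $r$ but far from efficient, and it is not claimed to be the procedure the authors have in mind. The clause encoding and function names are assumptions of this example.

```python
from itertools import combinations, product


def has_conflict(Y, Z):
    """True if some literal p of Y occurs negated in Z."""
    return any(-p in Z for p in Y)


def semicomplete_reduction(formula, r):
    """Repeatedly delete a semicomplete subset (2^r pairwise conflicting clauses)."""
    remaining = list(formula)
    while True:
        for idx in combinations(range(len(remaining)), 2**r):
            subset = [remaining[i] for i in idx]
            if all(has_conflict(Y, Z) for Y, Z in combinations(subset, 2)):
                remaining = [c for i, c in enumerate(remaining) if i not in idx]
                break  # restart the search on the smaller formula
        else:
            return remaining  # no semicomplete subset is left


def max_sat(formula):
    """sat(F) by brute force over all -1/1 assignments."""
    vs = sorted({abs(lit) for clause in formula for lit in clause})
    return max(
        sum(any(tau[abs(l)] == (1 if l > 0 else -1) for l in c) for c in formula)
        for tau in (dict(zip(vs, v)) for v in product((-1, 1), repeat=len(vs)))
    )


# The semicomplete example from the text (x=1, y=2, z=3) plus one extra clause.
F = [(1, 2), (1, -2), (-1, 3), (-1, -3), (2, 3)]
FS = semicomplete_reduction(F, 2)
print(FS, max_sat(F) - max_sat(FS))
```

On this input the four semicomplete clauses are removed and the difference of the optima matches part 2 of Lemma 7 with $r = 2$.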

5.2 Kernelization 

Let $F$ be a 2-CNF formula. A variable $x \in \mathrm{var}(F)$ is insignificant if for each literal
$y$ the numbers of occurrences of the two clauses $xy$ and $\bar{x}y$ in $F$ are the same. A
variable $x \in \mathrm{var}(F)$ is significant if it is not insignificant. A literal is significant or
insignificant if its underlying variable is significant or insignificant, respectively.

Theorem 2. Let $F$ be a 2-CNF formula with $F = F^S$ (i.e., $F$ contains no
semicomplete subsets) and let $k \ge 0$ be an integer. If $F$ has more than $3k - 2$ significant
variables, then $\mathrm{sat}(F) \ge (3|F| + k)/4$.

The remainder of this section is devoted to the proof of Theorem 2 and its
corollaries. Let $F$ be a 2-CNF formula with $m$ clauses and $n$ variables and let $k$ be an
integer. We assume that $F$ contains no semicomplete subsets, i.e., $F = F^S$.

For a literal $x$ let $c(x)$ denote the number of clauses in $F$ containing $x$. Given a
pair of literals $x$ and $y$, $x \ne y$, let $c(xy)$ be the number of occurrences of clause $xy$
in $F$.

Given a clause $C \in F$ and a variable $x \in \mathrm{var}(F)$, let $\delta_C(x)$ be an indicator
variable whose value is set as $\delta_C(x) = 1$ if $x \in C$, $\delta_C(x) = -1$ if $\bar{x} \in C$, and $\delta_C(x) = 0$
otherwise.

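Lemma 8. For each subset $R = \{x_1, \ldots, x_q\} \subseteq \mathrm{var}(F)$ we have $\mathrm{sat}(F) \ge (3m + k_R)/4$
for

$$k_R = \sum_{1 \le i \le q} \big(c(x_i) - c(\bar{x}_i)\big) + \sum_{1 \le i < j \le q} \big(c(x_i\bar{x}_j) + c(\bar{x}_i x_j) - c(x_i x_j) - c(\bar{x}_i\bar{x}_j)\big).$$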

Proof. Take a random truth assignment $\tau \in 2^{\mathrm{var}(F)}$ such that $\tau(x_i) = 1$ for all
$i \in \{1, \ldots, q\}$ and $P(\tau(x) = 1) = 0.5$ for all $x \in \mathrm{var}(F) \setminus R$. A simple case analysis
yields that the probability that a clause $C \in F$ is satisfied by $\tau$ is given by

$$P(\tau \text{ satisfies } C) = 1 - \frac{1}{4} \prod_{1 \le i \le q} (1 - \delta_C(x_i)).$$

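Observe that for any clause $C$ and any three distinct variables $x, y, z$ we have
$\delta_C(x)\delta_C(y)\delta_C(z) = 0$ as var(C) contains exactly two variables. Hence we can
determine the expected number of clauses satisfied by $\tau$ as follows.

$$\begin{aligned}
E(\mathrm{sat}(\tau, F)) &= \sum_{C \in F} P(\tau \text{ satisfies } C) \\
&= \sum_{C \in F} \Big\{ 1 - \frac{1}{4} \prod_{1 \le i \le q} (1 - \delta_C(x_i)) \Big\} \\
&= \frac{3}{4} m + \frac{1}{4} \Big\{ \sum_{1 \le i \le q} \sum_{C \in F} \delta_C(x_i) - \sum_{1 \le i < j \le q} \sum_{C \in F} \delta_C(x_i)\delta_C(x_j) \Big\} \\
&= \frac{3}{4} m + \frac{1}{4} k_R.
\end{aligned}$$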
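
The expectation computed in this proof can be checked by direct enumeration on a small formula. In the sketch below the clause encoding and the helper names (`count_literal`, `count_clause`, `k_R`, `expected_sat`) are assumptions made for illustration; every clause is assumed to contain two distinct variables.

```python
from fractions import Fraction
from itertools import combinations, product


def count_literal(formula, lit):
    """c(x): number of clauses containing the literal."""
    return sum(lit in clause for clause in formula)


def count_clause(formula, a, b):
    """c(ab): number of occurrences of the clause {a, b}."""
    return sum(set(clause) == {a, b} for clause in formula)


def k_R(formula, R):
    """The quantity k_R of Lemma 8 for a list R of variables."""
    single = sum(count_literal(formula, x) - count_literal(formula, -x) for x in R)
    pairs = sum(
        count_clause(formula, x, -y) + count_clause(formula, -x, y)
        - count_clause(formula, x, y) - count_clause(formula, -x, -y)
        for x, y in combinations(R, 2)
    )
    return single + pairs


def expected_sat(formula, R):
    """Exact E(sat(tau, F)) when variables in R are set to 1 and the rest are uniform."""
    vs = sorted({abs(lit) for clause in formula for lit in clause})
    free = [v for v in vs if v not in R]
    total = 0
    for values in product((-1, 1), repeat=len(free)):
        tau = {**{x: 1 for x in R}, **dict(zip(free, values))}
        total += sum(
            any(tau[abs(l)] == (1 if l > 0 else -1) for l in c) for c in formula
        )
    return Fraction(total, 2 ** len(free))


F = [(1, 2), (1, -2), (-1, 3), (2, 3), (-2, -3), (1, 3)]
R = [1, 2]
print(k_R(F, R), expected_sat(F, R))
```

Since some assignment achieves at least the expectation, this gives the bound $\mathrm{sat}(F) \ge (3m + k_R)/4$ on the example.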

We construct an auxiliary graph $G = (V, E)$ from $F$ by letting $V = \mathrm{var}(F)$ and
$xy \in E$ if and only if there exists a clause $C \in F$ with $\mathrm{var}(C) = \{x, y\}$ (equivalently,
$c(xy) + c(x\bar{y}) + c(\bar{x}y) + c(\bar{x}\bar{y}) \ge 1$).

We assign a weight to each vertex $x$ and edge $xy$ of $G = (V, E)$:

$$w(x) := \sum_{C \in F} \delta_C(x) = c(x) - c(\bar{x}),$$

$$w(xy) := -\sum_{C \in F} \delta_C(x)\delta_C(y) = c(x\bar{y}) + c(\bar{x}y) - c(xy) - c(\bar{x}\bar{y}).$$

For subsets $U \subseteq V$ and $H \subseteq E$, let $w(U) = \sum_{x \in U} w(x)$ and $w(H) = \sum_{xy \in H} w(xy)$.
The weight $w(Q)$ of a subgraph $Q = (U, H)$ is $w(U) + w(H)$. Let $G^0$ be the graph
obtained from $G$ by removing all edges of weight zero.
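
The weights and the graph $G^0$ can be computed directly from the indicator $\delta_C$. The following sketch is illustrative; the clause encoding and the function names are assumptions of this example, and edges are stored as sorted pairs of variables.

```python
from collections import defaultdict


def delta(clause, x):
    """delta_C(x): 1 if x is in C, -1 if its negation is in C, 0 otherwise."""
    return 1 if x in clause else (-1 if -x in clause else 0)


def vertex_weights(formula):
    """w(x) = sum over clauses of delta_C(x) = c(x) - c(not x)."""
    vs = sorted({abs(lit) for clause in formula for lit in clause})
    return {x: sum(delta(clause, x) for clause in formula) for x in vs}


def edge_weights(formula):
    """w(xy) = -sum over clauses of delta_C(x) * delta_C(y), for every edge of G."""
    w = defaultdict(int)
    for clause in formula:
        x, y = sorted(abs(lit) for lit in clause)
        w[(x, y)] -= delta(clause, x) * delta(clause, y)
    return dict(w)


def g0_edges(formula):
    """Edges of G^0: the edges of G whose weight is nonzero."""
    return {e for e, weight in edge_weights(formula).items() if weight != 0}


F = [(1, 2), (1, -2), (-1, 3), (2, 3), (-2, -3), (1, 3)]
print(vertex_weights(F))
print(edge_weights(F))
print(g0_edges(F))
```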

Lemma 9. A variable $x \in \mathrm{var}(F)$ is insignificant if and only if $x$ is an isolated vertex
in $G^0$ and $w(x) = 0$.

Proof. Suppose $x \in \mathrm{var}(F)$ is insignificant. Choose an edge $xy \in E$ (this is possible
since by construction $G$ has no isolated vertices). Since $x$ is insignificant, $c(xy) =
c(\bar{x}y)$ and $c(x\bar{y}) = c(\bar{x}\bar{y})$ and thus $w(xy) = 0$. Therefore the edge $xy$ does not appear
in $G^0$ and $x$ is isolated in $G^0$. Observe that we have $c(x) = c(\bar{x})$, which implies
$w(x) = 0$.

Suppose $x \in \mathrm{var}(F)$ is an isolated vertex of $G^0$ and $w(x) = 0$. Since $G$ has no
isolated vertices, we have $w(xy) = 0$ for all $xy \in E$. In order to derive a contradiction,
let us suppose $x$ is a significant variable of $F$. Consequently there is either (i) a
clause $xy \in F$ such that $c(xy) > c(\bar{x}y)$, or (ii) a clause $\bar{x}y \in F$ such that
$c(\bar{x}y) > c(xy)$. We consider case (i) only; case (ii) can be treated analogously. With
$w(xy) = 0$, we have $c(x\bar{y}) > c(\bar{x}\bar{y})$, and thus $x\bar{y} \in F$.

Now the condition $w(x) = c(x) - c(\bar{x}) = 0$ implies the existence of an edge $xz \in E$
with $z \ne y$ such that for some $z' \in \{z, \bar{z}\}$ we have $\bar{x}z' \in F$ and $c(\bar{x}z') > c(xz')$.
Without loss of generality, assume that $z' = z$. Since $w(xz) = 0$, we have $\bar{x}\bar{z} \in F$.
However, the four clauses $xy, x\bar{y}, \bar{x}z, \bar{x}\bar{z}$ in $F$ form a semicomplete 2-CNF formula,
which contradicts our assumption that $F = F^S$. Hence $x$ is indeed an insignificant
variable. □

For a set $X \subseteq \mathrm{var}(F)$ we let $F_X$ denote the 2-CNF formula obtained from $F$ by
replacing $x$ with $\bar{x}$ and $\bar{x}$ with $x$ for each $x \in X$. We say that $F_X$ is obtained from $F$
by switching $X$.

The following lemma follows immediately from the definitions of switch and weights. 

Lemma 10. The auxiliary graph $G_X$ corresponding to $F_X$ can be obtained from
$G = (V, E)$ by reversing the signs of the weights of all vertices in $X$ and all edges
between $X$ and $V \setminus X$. Moreover, $\mathrm{sat}(F) = \mathrm{sat}(F_X)$.

To distinguish between weights in $G$ and $G_X$, we use $w_X(\cdot)$ for weights of $G_X$.
Similarly, we use $c_X(\cdot)$ for $F_X$.
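
The effect of switching on the weights and on the optimum can be checked on a small formula. This sketch is only illustrative; the clause encoding and the names `switch`, `weights` and `max_sat` are assumptions of this example.

```python
from collections import defaultdict
from itertools import product


def switch(formula, X):
    """F_X: negate every literal whose variable lies in X."""
    return [tuple(-lit if abs(lit) in X else lit for lit in clause) for clause in formula]


def weights(formula):
    """Vertex weights w(x) and edge weights w(xy) of the auxiliary graph."""
    def delta(clause, x):
        return 1 if x in clause else (-1 if -x in clause else 0)

    w_vertex, w_edge = defaultdict(int), defaultdict(int)
    for clause in formula:
        x, y = sorted(abs(lit) for lit in clause)
        w_vertex[x] += delta(clause, x)
        w_vertex[y] += delta(clause, y)
        w_edge[(x, y)] -= delta(clause, x) * delta(clause, y)
    return dict(w_vertex), dict(w_edge)


def max_sat(formula):
    """sat(F) by brute force over all -1/1 assignments."""
    vs = sorted({abs(lit) for clause in formula for lit in clause})
    return max(
        sum(any(tau[abs(l)] == (1 if l > 0 else -1) for l in c) for c in formula)
        for tau in (dict(zip(vs, v)) for v in product((-1, 1), repeat=len(vs)))
    )


F = [(1, 2), (1, -2), (-1, 3), (2, 3), (-2, -3), (1, 3)]
X = {3}
FX = switch(F, X)
wv, we = weights(F)
wv_X, we_X = weights(FX)
print(wv, wv_X)
print(we, we_X)
print(max_sat(F), max_sat(FX))
```

Vertex weights inside $X$ and edge weights across the cut change sign, while the maximum number of satisfiable clauses is unchanged.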

It is sometimes convenient to stress that the set $X$ we are switching induces a
subgraph. We can switch an induced graph $Q$ by switching all the vertices of $Q$.
Observe that by switching an induced graph $Q$, we reverse the signs of weights on all
vertices of $Q$ and all edges incident with exactly one vertex of $Q$, but the sign of each
edge within $Q$ remains unchanged. This property will play a major role in showing that
a certain structure meets the condition of the following lemma.

Lemma 11. If there exist a set $X \subseteq V(G^0)$ and an induced subgraph $Q = (U, H)$ of
$G^0$ with $w_X(Q) \ge k$, then $\mathrm{sat}(F) \ge (3m + k)/4$.

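Proof. We consider $U = \{x_1, \ldots, x_q\}$ as a subset of $\mathrm{var}(F_X)$. By Lemmas 8 and 10,
$\mathrm{sat}(F) = \mathrm{sat}(F_X) \ge (3m + k_U)/4$, where

$$\begin{aligned}
k_U &= \sum_{i=1}^{q} \big(c_X(x_i) - c_X(\bar{x}_i)\big) + \sum_{1 \le i < j \le q} \big(c_X(x_i\bar{x}_j) + c_X(\bar{x}_i x_j) - c_X(x_i x_j) - c_X(\bar{x}_i\bar{x}_j)\big) \\
&= \sum_{i=1}^{q} w_X(x_i) + \sum_{1 \le i < j \le q} w_X(x_i x_j) = w_X(Q) \ge k. \qquad \Box
\end{aligned}$$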

To apply Lemma 11 in the proof of Theorem 2 we will focus on a special case
of induced subgraphs of $G^0$. For a set $U \subseteq V(G^0)$, let $G^0[U]$ denote the subgraph
of $G^0$ induced by $U$. We call $G^0[U]$ an induced star with center $x$ if $x$ is a vertex of
$G^0$, $I$ is an independent set in the subgraph of $G^0$ induced by the neighbors of $x$ and
$U = \{x\} \cup I$. We are interested in the induced star due to the following property.

Lemma 12. Let $x$ be the center of an induced star $Q = G^0[U]$ and let $I = U \setminus \{x\}$.
Then there is a set $X \subseteq U$ such that $w_X(Q) \ge |I|$.

Proof. Let $H$ be the set of edges of $Q$. We may assume that $w(xy) > 0$ for each $y \in I$
since otherwise we can switch $y$. By a random switch of $Q$, we mean a switch of $Q$
with probability 0.5. Take a random switch $R$ of $Q$. Then we have $E(w_R(z)) = 0$ for
all $z \in U$. Note that the sign of each edge in $H$ remains positive. Hence we have
$E(w_R(Q)) = w(H) \ge |I|$ and thus there exists a set $X \subseteq U$ for which $w_X(Q) \ge |I|$. □

If we are given more than one induced star, a sequence of random switches gives 
us a similar result. 

Lemma 13. Let $Q_1 = (U_1, H_1), \ldots, Q_m = (U_m, H_m)$ be a collection of vertex-disjoint
induced stars of $G^0$ with centers $x_1, \ldots, x_m$, let $U = \bigcup_{i=1}^{m} U_i$, and let $Q = G^0[U]$.
Then there is a set $X \subseteq U$ such that $w_X(Q) \ge \sum_{i=1}^{m} |I_i|$, where $I_i = U_i \setminus \{x_i\}$,
$i = 1, \ldots, m$.

Proof. As in the proof of Lemma 12, we may assume that all the edges of $H_i$ have
positive weights. Let $H$ be the set of edges of $Q$. By a random switch of $Q$, we mean a
sequence of switches of $Q_1, \ldots, Q_m$ each with probability 0.5. Take a random switch
$R$ of $Q$. Then we have $E(w_R(x)) = 0$ for all $x \in U$. Moreover, for the subgraph
$Q$ of $G^0_R$, it holds that $E(w_R(xy)) = 0$ for all $xy \in H \setminus \bigcup_{i=1}^{m} H_i$ since each choice
of $w_R(xy) > 0$ and $w_R(xy) < 0$ is equally likely. By linearity of expectation and
Lemma 12, we have $E(w_R(Q)) = w(\bigcup_{i=1}^{m} H_i) \ge \sum_{i=1}^{m} |I_i|$ and thus there exists a set
$X \subseteq U$ for which $w_X(Q) \ge \sum_{i=1}^{m} |I_i|$. □

We are now in the position to complete the proof of Theorem 2.
Suppose that $(F, k)$ is a no-instance, i.e., $\mathrm{sat}(F) < (3m + k)/4$. Notice that a
matching can be viewed as a collection of induced stars of $G^0$ for which $|I| = 1$. It
follows by Lemmas 11 and 13 that $G^0$ has no matching of size $k$. The Tutte-Berge
formula [H [5] states that the size of a maximum matching in $G^0$ equals

$$\min_{S \subseteq V(G^0)} \frac{1}{2}\big\{ |V(G^0)| + |S| - \mathrm{oc}(G^0 - S) \big\}$$

where $\mathrm{oc}(G^0 - S)$ is the number of odd components (connected components with
an odd number of vertices) in $G^0 - S$. Hence there is a set $S \subseteq V(G^0)$ such that
$|V(G^0)| + |S| - \mathrm{oc}(G^0 - S) < 2k$. It follows that

$$|V(G^0)| \le \mathrm{oc}(G^0 - S) - |S| + 2k - 1. \qquad (3)$$

We will now classify odd components in $G^0 - S$. One obvious type of odd
component is an isolated vertex in $G^0$ of weight zero, which corresponds to an insignificant
variable by Lemma 9. All the other odd components can be categorized into one of
the following two types:

1. Let $Q_1, \ldots, Q_L$ be the odd components of $G^0 - S$ such that for all $1 \le i \le L$
we have $|Q_i| = 1$ and $Q_i$ is a significant variable.

2. Let $Q'_1, \ldots, Q'_{L'}$ be the odd components of $G^0 - S$ such that for all $1 \le i \le L'$
we have $|Q'_i| > 1$.

We construct a collection of induced stars as follows. From each of $Q'_1, \ldots, Q'_{L'}$
we choose an edge, which is an induced star with $|I| = 1$. Let us consider $Q_1, \ldots, Q_L$.
Each vertex $Q_i$ is adjacent to at least one vertex of $S$. Thus, we can partition
$Q_1, \ldots, Q_L$ into $|S|$ sets, some of them possibly empty, such that each partite set forms
an independent set in which every vertex is adjacent to the corresponding vertex $x_j$
of $S$. Each partite set, together with $x_j$, forms an induced star. Now observe that
we have a collection of induced stars and the total number of edges equals $L + L'$. If
$L + L' \ge k$, Lemma 13 implies that for some set $X$ of vertices from the odd components
$w_X(Q) \ge k$, which is impossible by Lemma 11. Hence $L + L' \le k - 1$.

Therefore, $\mathrm{oc}(G^0 - S) - n' = L + L' \le k - 1$, where $n'$ is the number of insignificant
variables. By (3), we have $|V(G^0)| - n' \le k - 1 - |S| + 2k - 1 \le 3k - 2$. It remains
to observe that $|V(G^0)| - n'$ equals the number of significant variables of $F$. This
completes the proof of Theorem 2.
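For readers who want the final counting step spelled out, the chain of inequalities can be written as follows. This is only an expanded restatement of the argument above, using inequality (3) and the bound $L + L' \le k - 1$; no new assumption is introduced.

```latex
\begin{align*}
|V(G^0)| - n'
  &\le \bigl(\mathrm{oc}(G^0 - S) - n'\bigr) - |S| + 2k - 1
     && \text{by (3)} \\
  &= (L + L') - |S| + 2k - 1
     && \text{classification of odd components} \\
  &\le (k - 1) - |S| + 2k - 1
     && \text{since } L + L' \le k - 1 \\
  &= 3k - 2 - |S| \;\le\; 3k - 2 .
\end{align*}
```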

Corollary 2. The problem Max-2-Sat$_{\mathrm{TLB}}$ admits a (polynomial time) reduction to
a problem kernel with at most $3k - 1$ variables.

Proof. Consider an instance $(F, k)$ of the problem. First we apply the semicomplete
reduction and obtain (in polynomial time) an instance $(F', k)$ with $F' = F^s$. We
determine (again in polynomial time) the set $S'$ of significant variables of $F'$. If
$|S'| > 3k - 2$, then $(F', k)$ is a yes-instance by Theorem 2 and consequently $(F, k)$ is
a yes-instance by Lemma 7. Assume now that $|S'| \le 3k - 2$.

Let $z$ be a new variable not occurring in $F$. Since $F' = F^s$, no clause contains
two insignificant variables and, thus, each insignificant variable can be replaced by $z$
without changing the solution to $(F', k)$. Let us denote the modified $F'$ by $F''$; $F''$
has at most $3k - 1$ variables.

Let $p$ be the number of clauses in $F''$. Observe that we can find a truth assignment
satisfying the maximum number of clauses of $F''$ in time $O(p\,8^k)$. Thus, if $p > 8^k$,
we can find the optimal truth assignment in the polynomial time $O(p^2) = O(m^2)$.
Thus, we may assume that $F''$ has at most $8^k$ clauses. Therefore, $F''$ is a kernel of
the Max-2-Sat$_{\mathrm{TLB}}$ problem. □
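The exhaustive step in this proof can be illustrated with a short sketch. With at most $3k - 1$ variables there are at most $2^{3k-1} \le 8^k$ assignments, and each is evaluated against the $p$ clauses, which gives the $O(p\,8^k)$ bound. The clause encoding below (signed integers, `+i` for a variable and `-i` for its negation) and the function names are assumptions made for illustration; they do not come from the paper.

```python
from itertools import product


def max_sat_bruteforce(clauses):
    """Return (best_count, assignment) maximizing the number of satisfied clauses.

    Each clause is a tuple of nonzero ints: +i means x_i, -i means NOT x_i.
    Runs in O(p * 2^n) time for p clauses over n variables.
    """
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    best_count, best_assignment = -1, None
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        count = sum(
            any(assignment[abs(lit)] == (lit > 0) for lit in clause)
            for clause in clauses
        )
        if count > best_count:
            best_count, best_assignment = count, assignment
    return best_count, best_assignment


def is_yes_instance(clauses, k):
    """(F, k) is a yes-instance iff sat(F) >= (3m + k) / 4."""
    m = len(clauses)
    best_count, _ = max_sat_bruteforce(clauses)
    return 4 * best_count >= 3 * m + k


# All four 2-clauses over x_1, x_2: every assignment satisfies exactly three.
all_four = [(1, 2), (1, -2), (-1, 2), (-1, -2)]
print(max_sat_bruteforce(all_four)[0], is_yes_instance(all_four, 0))
```

The example instance meets the lower bound $3m/4$ exactly, so it is a yes-instance only for $k = 0$.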



6 Concluding Remarks 

• Our algorithm for the parameterized MAX-r-SAT problem can be easily modified
to provide, efficiently, for any given instance of $m$ clauses for which there is a truth
assignment satisfying at least $k/2^r$ clauses above the average, an assignment for the
variables with this property. Indeed, the proof of Theorem 1 only requires that the
variables $x_i$ are $4r$-wise independent, and there are known constructions of polynomial
size sample spaces supporting such random variables (see, e.g., [2], Chapter 16). Thus,
if in the polynomial $X$, $\sqrt{|S|}/(2 \cdot 8^r) > k$, then one can find an assignment satisfying
at least as many clauses as needed by going over all points in such a sample space,
and if not, one can solve the problem by an exhaustive search.

• The fixed-parameter tractability result on MAX-r-SAT can be easily extended to 
any family of Boolean r-Constraint Satisfaction Problems. Here is an outline of the 
argument. 

Let $r$ be a fixed positive integer, let $\Phi$ be a set of Boolean functions, each involving
at most $r$ variables, and let $\mathcal{F} = \{f_1, f_2, \ldots, f_m\}$ be a collection of Boolean functions,
each being a member of $\Phi$, and each acting on some subset of the $n$ Boolean variables
$x_1, x_2, \ldots, x_n$. The Boolean Max-r-Constraint Satisfaction Problem (corresponding
to $\Phi$), which we denote by the MAX-r-CSP problem, for short, when $\Phi$ is clear
from the context, is the problem of finding a truth assignment to the variables so as
to maximize the total number of functions satisfied. Note that this includes, as a
special case, the MAX-r-SAT problem considered in the previous section, as well as
many related problems. As most interesting problems of this type are NP-hard, we
consider their parameterized version, where the parameter is, as before, the number
of functions satisfied minus the expected value of this number. Note, in passing, that
the above expected value is a tight lower bound for the problem, whenever the family
$\Phi$ is closed under replacing each variable by its complement, since if we apply any
Boolean function to all $2^r$ choices of literals whose underlying variables are any fixed
set of $r$ variables, then any truth assignment to the variables satisfies exactly the same
number of these $2^r$ functions.
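The tightness remark can be checked mechanically for small $r$. The sketch below assumes a Boolean function is given as a Python callable on a tuple of $r$ booleans, and that "choosing literals" means complementing a subset of its inputs; the helper name `satisfied_counts` is hypothetical.

```python
from itertools import product


def satisfied_counts(f, r):
    """For each assignment x, count how many of the 2^r complemented copies of f it satisfies.

    The copy indexed by c is x -> f(x XOR c), i.e. variable j is negated when c[j] is True.
    """
    patterns = list(product([False, True], repeat=r))
    counts = []
    for x in patterns:
        satisfied = sum(
            bool(f(tuple(xj != cj for xj, cj in zip(x, c))))
            for c in patterns
        )
        counts.append(satisfied)
    return counts


def clause_or(v):
    return any(v)  # a clause on three positive literals


def majority(v):
    return sum(v) >= 2


# Every assignment satisfies the same number of copies, namely |V|.
print(set(satisfied_counts(clause_or, 3)), set(satisfied_counts(majority, 3)))
```

Since every assignment satisfies the same number of the $2^r$ copies, no assignment beats the average on such a collection, which is why the expected value cannot be improved as a lower bound.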
For each Boolean function $f$ of $r$ Boolean variables

$$x_{i_1}, x_{i_2}, \ldots, x_{i_r},$$

define a random variable $X_f$ as follows. As in the discussion of the MAX-r-SAT
problem, suppose each variable $x_{i_j}$ attains values in $\{-1, 1\}$. Let $V \subseteq \{-1, 1\}^r$
denote the set of all satisfying assignments of $f$. Then

$$X_f(x_1, x_2, \ldots, x_n) = \sum_{v = (v_1, \ldots, v_r) \in V} \Bigl[ \prod_{j=1}^{r} (1 + x_{i_j} v_j) - 1 \Bigr].$$

This is a random variable defined over the space $\{-1, 1\}^n$ and its value at $x =
(x_1, x_2, \ldots, x_n)$ is $2^r - |V|$ if $x$ satisfies $f$, and is $-|V|$ otherwise. Thus, the expectation
of $X_f$ is zero. Define now $X = \sum_{f \in \mathcal{F}} X_f$. Then the value of $X$ at $x = (x_1, x_2, \ldots, x_n)$
is precisely $2^r(s - a)$, where $s$ is the number of the functions satisfied by the truth
assignment $x$, and $a$ is the average value of the number of satisfied functions. Our
objective is to decide if $X$ attains a value of at least $k$. As this is a polynomial of degree
at most $r$ with integer coefficients and expectation zero, we can repeat the arguments
of Section0]and prove that, for every fixed r, the problem is fixed-parameter tractable. 
Moreover, our previous arguments show that the problem admits a polynomial-size
bikernel reducing it to an instance of MAX-r-LIN2$_{\mathrm{TLB}}$ of size $O(k^2)$, and if the specific
r-CSP problem considered is NP-complete, then there is a polynomial size kernel.
This is the case for most interesting choices of the family $\Phi$.
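The algebraic encoding in this outline can be sanity-checked by direct evaluation. The sketch below assumes every function has exactly $r$ inputs, represents each function as a callable on a tuple of $\pm 1$ values together with the indices of the variables it reads, and takes $-1$ to mean "true"; that sign convention and all names are assumptions for illustration, and the identity does not depend on the convention.

```python
from itertools import product
from math import prod


def satisfying_set(f, r):
    """All v in {-1, 1}^r with f(v) true."""
    return [v for v in product((-1, 1), repeat=r) if f(v)]


def x_f(f, idx, x):
    """Value of X_f at the point x; idx lists the positions of the variables f reads."""
    local = [x[i] for i in idx]
    return sum(
        prod(1 + xj * vj for xj, vj in zip(local, v)) - 1
        for v in satisfying_set(f, len(idx))
    )


def check_identity(functions, n):
    """Check X(x) = 2^r * s(x) - sum_f |V_f|, i.e. X = 2^r (s - a), at every x."""
    r = len(functions[0][1])
    total_v = sum(len(satisfying_set(f, r)) for f, _ in functions)
    for x in product((-1, 1), repeat=n):
        s = sum(bool(f(tuple(x[i] for i in idx))) for f, idx in functions)
        big_x = sum(x_f(f, idx, x) for f, idx in functions)
        if big_x != 2 ** r * s - total_v:
            return False
    return True


def is_or(v):
    return any(t == -1 for t in v)  # assumed convention: -1 encodes "true"


def is_xor(v):
    return prod(v) == -1  # an odd number of inputs are "true"


functions = [(is_or, (0, 1)), (is_xor, (1, 2)), (is_or, (0, 2))]
print(check_identity(functions, 3))
```

Because $a = 2^{-r}\sum_f |V_f|$, the integer check `2**r * s - total_v` is the same quantity as $2^r(s - a)$ without any floating-point arithmetic.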